Prompt Switch: Efficient CLIP Adaptation for Text-Video Retrieval
In text-video retrieval, recent works have benefited from the powerful
learning capabilities of pre-trained text-image foundation models (e.g., CLIP)
by adapting them to the video domain. A critical problem for them is how to
effectively capture the rich semantics inside the video using the image encoder
of CLIP. To tackle this, state-of-the-art methods adopt complex cross-modal
modeling techniques to fuse the text information into video frame
representations, which, however, incurs severe efficiency issues in large-scale
retrieval systems as the video representations must be recomputed online for
every text query. In this paper, we discard this problematic cross-modal fusion
process and aim to learn semantically-enhanced representations purely from the
video, so that the video representations can be computed offline and reused for
different texts. Concretely, we first introduce a spatial-temporal "Prompt
Cube" into the CLIP image encoder and iteratively switch it within the encoder
layers to efficiently incorporate the global video semantics into frame
representations. We then propose to apply an auxiliary video captioning
objective to train the frame representations, which facilitates the learning of
detailed video semantics by providing fine-grained guidance in the semantic
space. With a naive temporal fusion strategy (i.e., mean-pooling) on the
enhanced frame representations, we obtain state-of-the-art performance on
three benchmark datasets, i.e., MSR-VTT, MSVD, and LSMDC.
Comment: to appear in ICCV202
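The offline/online split described in this abstract can be illustrated with a minimal sketch. It assumes frame embeddings are plain lists of floats and that retrieval ranks videos by cosine similarity; the toy vectors and the names `mean_pool`, `cosine`, and `rank_videos` are hypothetical and are not taken from the paper or from CLIP.

```python
import math

def mean_pool(frames):
    """Average equal-length frame vectors into a single video vector."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_videos(text_vec, video_index):
    """Return video ids sorted by decreasing similarity to the text query."""
    return sorted(video_index,
                  key=lambda vid: cosine(text_vec, video_index[vid]),
                  reverse=True)

# Toy frame embeddings (hypothetical values, not real encoder outputs).
video_frames = {
    "cooking": [[1.0, 0.0], [0.8, 0.2]],
    "soccer": [[0.0, 1.0], [0.2, 0.8]],
}

# Offline step: computed once, with no dependence on any text query.
video_index = {vid: mean_pool(frames) for vid, frames in video_frames.items()}

# Online step: each new query only needs one similarity per stored video.
print(rank_videos([0.9, 0.1], video_index))
```

Because the video vectors never depend on the query, the index can be built once and reused, which is the efficiency argument the abstract makes against cross-modal fusion.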
Identity-Consistent Aggregation for Video Object Detection
In Video Object Detection (VID), a common practice is to leverage the rich
temporal contexts from the video to enhance the object representations in each
frame. Existing methods treat the temporal contexts obtained from different
objects indiscriminately and ignore their different identities. Intuitively,
however, aggregating local views of the same object in different frames may
facilitate a better understanding of the object. Thus, in this paper, we aim to
enable the model to focus on the identity-consistent temporal contexts of each
object to obtain more comprehensive object representations and handle the rapid
object appearance variations such as occlusion, motion blur, etc. However,
realizing this goal on top of existing VID models faces low-efficiency problems
due to their redundant region proposals and nonparallel frame-wise prediction
manner. To address this, we propose ClipVID, a VID model equipped with
Identity-Consistent Aggregation (ICA) layers specifically designed for mining
fine-grained and identity-consistent temporal contexts. It effectively reduces
the redundancies through the set prediction strategy, making the ICA layers
very efficient and further allowing us to design an architecture that makes
parallel clip-wise predictions for the whole video clip. Extensive experimental
results demonstrate the superiority of our method: a state-of-the-art (SOTA)
performance (84.7% mAP) on the ImageNet VID dataset while running at a speed
about 7x faster (39.3 fps) than previous SOTAs.
Comment: to appear at ICCV202
Minimizing the number of edges in $K_{s,t}$-saturated bipartite graphs
This paper considers an edge minimization problem in saturated bipartite
graphs. An $n$ by $n$ bipartite graph $G$ is $H$-saturated if $G$ does not
contain a subgraph isomorphic to $H$ but adding any missing edge to $G$ creates
a copy of $H$. More than half a century ago, Wessel and Bollobás
independently solved the problem of minimizing the number of edges in
$K_{(s,t)}$-saturated graphs, where $K_{(s,t)}$ is the `ordered' complete
bipartite graph with $s$ vertices from the first color class and $t$ from the
second. However, the very natural `unordered' analogue of this problem was
considered only half a decade ago by Moshkovitz and Shapira. When $s=t$, it can
be easily checked that the unordered variant is exactly the same as the ordered
case. Later, Gan, Korándi, and Sudakov gave an asymptotically tight bound on
the minimum number of edges in $K_{s,t}$-saturated $n$ by $n$ bipartite graphs,
which is only smaller than the conjecture of Moshkovitz and Shapira by an
additive constant. In this paper, we confirm their conjecture for $s=t-1$ with
the classification of the extremal graphs. We also improve the estimates of
Gan, Korándi, and Sudakov for general $s$ and $t$, and for all sufficiently
large $n$.
Comment: Reflected minor suggestions from reviewer
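The saturation definition in this abstract can be checked by brute force on small examples. The sketch below assumes a bipartite graph stored as a set of `(left, right)` index pairs and treats the forbidden graph as the unordered complete bipartite graph with parts of size `s` and `t`; the function names are hypothetical, and the exhaustive search is only practical for tiny graphs.

```python
from itertools import combinations

def has_ordered_copy(edges, n_left, n_right, s, t):
    """True if some s left vertices share at least t common right neighbours."""
    for left in combinations(range(n_left), s):
        common = [b for b in range(n_right)
                  if all((a, b) in edges for a in left)]
        if len(common) >= t:
            return True
    return False

def has_copy(edges, n_left, n_right, s, t):
    """Unordered variant: the size-s side may lie in either colour class."""
    return (has_ordered_copy(edges, n_left, n_right, s, t)
            or has_ordered_copy(edges, n_left, n_right, t, s))

def is_saturated(edges, n_left, n_right, s, t):
    """No copy is present, but adding any missing edge creates one."""
    edges = set(edges)
    if has_copy(edges, n_left, n_right, s, t):
        return False
    for a in range(n_left):
        for b in range(n_right):
            if (a, b) in edges:
                continue
            if not has_copy(edges | {(a, b)}, n_left, n_right, s, t):
                return False
    return True

# A "double star" on 3 + 3 vertices: left vertex 0 and right vertex 0
# are each joined to every vertex on the opposite side.
double_star = {(0, 0), (0, 1), (0, 2), (1, 0), (2, 0)}
print(is_saturated(double_star, 3, 3, 2, 2))
```

In the double star, any missing edge joins a left vertex `i >= 1` to a right vertex `j >= 1`, and adding it gives left vertices 0 and `i` the two common neighbours 0 and `j`, which completes a 4-cycle.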
Vertex Downgrading to Minimize Connectivity
We consider the problem of interdicting a directed graph by deleting nodes with the goal of minimizing the local edge connectivity of the remaining graph from a given source to a sink. We introduce and study a general downgrading variant of the interdiction problem where the capacity of an arc is a function of the subset of its endpoints that are downgraded, and the goal is to minimize the downgraded capacity of a minimum source-sink cut subject to a node downgrading budget. This models the case when both ends of an arc must be downgraded to remove it, for example. For this generalization, we provide a bicriteria (4,4)-approximation that downgrades nodes with total weight at most 4 times the budget and provides a solution where the downgraded connectivity from the source to the sink is at most 4 times that in an optimal solution. We accomplish this with an LP relaxation and rounding using a ball-growing algorithm based on the LP values. We further generalize the downgrading problem to one where each vertex can be downgraded to one of k levels, and the arc capacities are functions of the pairs of levels to which its ends are downgraded. We generalize our LP rounding to get a (4k,4k)-approximation for this case
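To make the downgrading model concrete, here is a minimal brute-force sketch of the problem statement. It is not the LP relaxation or the ball-growing rounding from the abstract. It assumes each arc carries four capacities indexed by which endpoints are downgraded (none, tail only, head only, both), that the source and sink themselves cannot be downgraded, and that the graph is small enough to enumerate every node subset; all names are hypothetical.

```python
from collections import deque
from itertools import combinations

def max_flow(n, cap, s, t):
    """Edmonds-Karp on {(u, v): capacity}; the flow value equals the min s-t cut."""
    residual = {}
    for (u, v), c in cap.items():
        residual[(u, v)] = residual.get((u, v), 0) + c
        residual.setdefault((v, u), 0)
    adj = {u: [] for u in range(n)}
    for (u, v) in residual:
        adj[u].append(v)
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        # Find the bottleneck along the path, then push that much flow.
        bottleneck = float("inf")
        v = t
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[(parent[v], v)])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
            v = u
        flow += bottleneck

def best_downgrade(n, arcs, weights, budget, s, t):
    """Try every affordable node subset; return (min cut value, downgraded set)."""
    candidates = [v for v in range(n) if v not in (s, t)]  # assumption: s, t fixed
    best = None
    for k in range(len(candidates) + 1):
        for chosen in combinations(candidates, k):
            if sum(weights[v] for v in chosen) > budget:
                continue
            d = set(chosen)
            # Index 0: none, 1: tail only, 2: head only, 3: both downgraded.
            cap = {(u, v): c[(u in d) + 2 * (v in d)]
                   for (u, v), c in arcs.items()}
            value = max_flow(n, cap, s, t)
            if best is None or value < best[0]:
                best = (value, d)
    return best

# Two disjoint paths 0-1-2-5 and 0-3-4-5. The middle arc of each path
# disappears only when both of its endpoints are downgraded.
fixed, removable = (1, 1, 1, 1), (1, 1, 1, 0)
arcs = {(0, 1): fixed, (1, 2): removable, (2, 5): fixed,
        (0, 3): fixed, (3, 4): removable, (4, 5): fixed}
weights = [0, 1, 1, 1, 1, 0]
print(best_downgrade(6, arcs, weights, 2, 0, 5))
```

The toy instance uses the case the abstract mentions, where both ends of an arc must be downgraded to remove it: a budget of 1 cannot lower the connectivity, a budget of 2 removes one path, and a budget of 4 disconnects the sink.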
Plasma exosomal microRNAs are non-invasive biomarkers of moyamoya disease: A pilot study
Background: As a progressive cerebrovascular disease, Moyamoya Disease (MMD) is a common cause of stroke in children and adults. However, the early biomarkers and pathogenesis of MMD remain poorly understood.
Methods and material: This study was conducted using plasma exosome samples from MMD patients. Next-generation high-throughput sequencing, real-time quantitative PCR, gene ontology analysis, and Kyoto Encyclopaedia of Genes and Genomes pathway analysis of ideal exosomal miRNAs that could be used as potential biomarkers of MMD were performed. The area under the Receiver Operating Characteristic (ROC) curve was used to evaluate the sensitivity and specificity of biomarkers for predicting events.
Results: Exosomes were successfully isolated and miRNA-sequence analysis yielded 1,002 differentially expressed miRNAs. Functional analysis revealed that they were mainly enriched in axon guidance, regulation of the actin cytoskeleton and the MAPK signaling pathway. Furthermore, 10 miRNAs (miR-1306-5p, miR-196b-5p, miR-19a-3p, miR-22-3p, miR-320b, miR-34a-5p, miR-485-3p, miR-489-3p, miR-501-3p, and miR-487-3p) were found to be associated with the most sensitive and specific pathways for MMD prediction.
Conclusions: Several plasma secretory miRNAs closely related to the development of MMD have been identified, which can be used as biomarkers of MMD and contribute to differentiating MMD from non-MMD patients before digital subtraction angiography
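The area under the ROC curve mentioned in the Methods can be computed directly from its rank interpretation: the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half. The sketch below assumes binary labels (1 for MMD, 0 for control) and a single numeric score per sample; the sample values are invented for illustration and are not data from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))

# Hypothetical expression scores for three patients and three controls.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
print(roc_auc(labels, scores))
```

An AUC of 0.5 corresponds to a marker that ranks patients and controls at random, and 1.0 to perfect separation.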
Cyclically 5-Connected Graphs
Tutte's Four-Flow Conjecture states that every bridgeless, Petersen-free graph admits a nowhere-zero 4-flow. This hard conjecture has been open for over half a century, with no significant progress in the first forty years. In recent decades, Robertson, Thomas, Sanders and Seymour have proved the cubic version of this conjecture. Their strategy involved the study of the class of cyclically 5-connected cubic graphs. It turns out that a minimum counterexample to the general Four-Flow Conjecture is also cyclically 5-connected. Motivated by this fact, we wish to find structural properties of this class in the hope of producing a list of minor-minimal cyclically 5-connected graphs.
5-Hydroxy-1-(3-hydroxy-2-naphthoyl)-3,5-dimethyl-2-pyrazoline
In the title molecule, C16H16N2O3, intramolecular O—H⋯O hydrogen bonds influence the molecular conformation. Intermolecular O—H⋯O hydrogen bonds [O⋯O = 2.922 (2) Å] link the molecules into centrosymmetric dimers. Weak intermolecular C—H⋯O interactions assemble these dimers into layers parallel to the bc plane.